Many data management applications must deal with data which is uncertain, incomplete, or noisy.
However, on existing uncertain data representations, we cannot tractably
perform the important query evaluation tasks of determining query possibility, certainty, 
or probability: these problems are hard on arbitrary uncertain input instances. We thus ask 
whether we could restrict the structure of uncertain data so as to guarantee the tractability 
of exact query evaluation. We present our tractability results for tree and tree-like uncertain 
data, and a vision for probabilistic rule reasoning. We also study uncertainty about order, 
proposing a suitable representation, and study uncertain data conditioned by additional 
observations. 


1 Introduction 

Traditional database management theory assumes that data is correct and complete. However, more and 
more applications deal with incomplete, uncertain, and noisy data. For instance, data is extracted or
inferred automatically from random Web pages by automated and error-prone extraction programs ifTSll;
integrated from diverse sources through approximate mappings [22]; contributed to collaboratively
editable knowledge bases [46] by untrustworthy users; or deduced from the imprecise answers of random
workers on crowdsourcing platforms [9, 39].

Various kinds of uncertainty can hold on the data, which influences our choice of how to represent 
it. The best known is fact uncertainty: we are dealing with statements for which we do not know
whether they are correct or incorrect. However, there are other situations, such as order uncertainty: 
we are interested in an order relation on facts (e.g., time, relevance) or on the objects (e.g., preference, 
quality), and we only have partial information about this order (e.g., it was obtained from conflicting 
user preferences, or by integrating event sequences that are not synchronized). 

The straightforward way to extend existing data management paradigms to uncertain data is to represent
explicitly all possible states of the data (which we call possible worlds), and to define the semantics
of queries as returning all answers that can be obtained on the possible worlds. Of course, this simple
scheme is not practical: there are often exponentially many possible worlds, so we cannot represent
them all, much less query them. Fortunately, the possible worlds are often structured, e.g., by
independence or decomposability assumptions. This encourages us to design representation systems, which
concisely describe a collection of possible worlds, and evaluate queries directly on the representation,
to return a representation of all possible results.

Of course, querying uncertain data implies that, in general, query results will themselves be
uncertain. Still, they have many uses. They allow us to determine whether some answers are possible or
certain, or to estimate which ones are likely, based on a probabilistic model on the underlying source
of uncertainty (e.g., the trustworthiness of sources). We can also use them to specialize the result of 
the query, without reevaluating it from scratch, if we ever obtain information that lifts some of the 
uncertainty. For instance, when we have access to human users (e.g., via the crowd), we can use the 
uncertain query results to estimate which additional knowledge would help reduce the uncertainty, and 
ask them the right questions to make the query output more crisp. 

We have thus defined semantics for uncertain data. Yet, this does not tell us whether we can manage it
tractably. Sadly, in general, this is not the case. For example, in the context of fact uncertainty, consider
the framework of tuple-independent (or TID) instances [36], which are the simplest kind of probabilistic
relational instances: all facts are independently present or absent with a given probability. Consider the
conjunctive query (CQ) q : ∃x y R(x) S(x, y) T(y). It is #P-hard lfT9l to compute the probability that q
holds on an input TID instance, and this is a data complexity result, i.e., the complexity is measured
only in the instance, with the query assumed to be fixed. This contrasts with the AC^0 data complexity [2]
of CQs on traditional instances, and makes it necessary in practice to approximate query results via
sampling. In other contexts, e.g., order uncertainty, or uncertain information that was partly
disambiguated using crowd answers, we do not even know whether there are good representation systems.
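
To make the task concrete, the following Python sketch computes the probability of q on a tiny TID
instance by enumerating all possible worlds. Here q is read as: there exist x and y such that R(x),
S(x, y), and T(y) all hold. The instance, its probabilities, and the function names are illustrative
assumptions, not taken from the original. The enumeration is exponential in the number of facts, and
the #P-hardness result suggests that no polynomial-time exact method exists in general.

```python
from itertools import product

# Hypothetical TID instance: each fact maps to its independent probability.
TID = {
    ("R", ("a",)): 0.5,
    ("S", ("a", "b")): 0.5,
    ("S", ("a", "c")): 0.5,
    ("T", ("b",)): 0.5,
    ("T", ("c",)): 0.5,
}


def holds(world):
    """Check q: there exist x, y with R(x), S(x, y), and T(y) all in the world."""
    for rel, args in world:
        if rel == "S":
            x, y = args
            if ("R", (x,)) in world and ("T", (y,)) in world:
                return True
    return False


def query_probability(tid):
    """Sum the probabilities of all possible worlds that satisfy q (brute force)."""
    facts = list(tid)
    total = 0.0
    for choice in product([False, True], repeat=len(facts)):
        world = {f for f, keep in zip(facts, choice) if keep}
        weight = 1.0
        for f, keep in zip(facts, choice):
            weight *= tid[f] if keep else 1.0 - tid[f]
        if holds(world):
            total += weight
    return total
```

On this instance, 7 of the 32 equally likely worlds satisfy q, so `query_probability(TID)` is 7/32.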

The goal of my PhD is to address this problem from a theoretical angle, identifying situations where
the structure of uncertain data ensures the tractability of exact query evaluation in terms of possibility,
necessity, and probability. In other words, my goal is to show that exact query evaluation is tractable
when we make assumptions on the data: on the structure of the underlying facts, on the kind of
uncertainty, and on its structure (e.g., fact correlations). The hope would be to identify tractable classes
covering practical examples of uncertain data, and achieve a theoretical understanding of why and how
we can tractably query them.

The main focus is on fact uncertainty, which is studied in Section 2. We first study tree
representations of data, in the context of probabilistic XML [35], giving examples of how a tractability result for
local uncertainty models on trees ifTTl can be generalized to global uncertainty models where the scopes
of uncertain events have bounded overlap. We then move to relational representations, and explain how
the XML tractability results generalize to uncertain relational instances that have bounded treewidth,
in the sense of having a simultaneous bounded-width decomposition of their underlying instance and
their uncertainty annotations. We then describe perspectives to extend this result, the main one being
the problem of reasoning under uncertain rules.

My second focus (in Section 3) is on order uncertainty. After motivating this problem, I review our
current results of defining a bag semantics for the positive relational algebra on uncertain ordered data,
and give perspectives such as managing order uncertainty arising from uncertain numerical values. As
a third focus, Section 4 studies the question of conditioning uncertain data, e.g., by integrating crowd
answers to reduce the uncertainty. Section 5 concludes.

2 Fact Uncertainty 

2.1 Trees 

We start with tree representations of data, i.e., XML documents. Figure 1 illustrates such a document
(ignoring for now the annotations): it describes part of the Wikidata [46] entry about Chelsea Manning.

As the data on Wikidata is not always correct, the information contained in our tree is uncertain;
we use the PrXML probabilistic XML formalism ll^ to represent this. For instance, in Figure 1,
the ind node describes that the “occupation” subtree may or may not be present, with a probability of
0.4, independently from all other nodes, modeling our uncertainty about whether this information is
correct. The mux node represents our estimation of the probability that the given name is “Bradley” or
“Chelsea”; mux nodes, unlike ind nodes, allow choices that are mutually exclusive.

[Figure 1: Example PrXML document. The tree contains the labels “given name”, “surname”, “place of
birth”, and “occupation”; an ind node, a mux node, and two cie nodes annotated with the event e_Jane;
and the values “Bradley”, “Chelsea”, “Manning”, “Crescent”, and “musician”. Event table: e_Jane has
probability 0.9.]

ind and mux nodes can represent local uncertainty: indeed, all of their choices have to be taken 
independently, and only affect their descendants. As it turns out, query evaluation on trees, in the sense 
of determining the probability that a query holds, is tractable under local uncertainty of this kind ifTTl,
for the usual tree query languages such as tree-pattern queries or monadic second-order (MSO) queries 
without joins. Tree documents with local uncertainty are thus an example of structurally tractable 
uncertain data. 

However, not all uncertainty sources can be modeled with local uncertainty. Say that the place of 
birth and surname facts were added to the Wikidata entry by user Jane. Rather than modelling them 
as being independent, we would like to represent them in a correlated fashion: either user Jane is 
trustworthy, and both facts are likely to be true, or she is a vandal, and both are unlikely. To model this, 
we use uncertain events, which are a form of global uncertainty. As a simple example, in Figure 1, the
event e_Jane indicates that we fully trust Jane with probability 0.9, and the cie nodes (for “conjunction of
independent events”) indicate that the place of birth and surname facts are either both present or both
absent, depending on whether we trust Jane. As events can be reused at any point in the document,
they can introduce correlations between arbitrary document parts, so that query evaluation is generally
intractable with events [34].
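
The effect of a shared event can be illustrated with a small Python sketch that enumerates the possible
worlds of a simplified version of the document of Figure 1. The encoding as a dictionary of Boolean
choices is an assumption made for illustration, and the mux node is left out because its probabilities
are not given in the text.

```python
from itertools import product

# Independent Boolean choices of the simplified document (illustrative encoding):
# the event e_jane (probability 0.9) and the ind node over "occupation" (0.4).
CHOICES = {"e_jane": 0.9, "ind_occupation": 0.4}


def present_nodes(valuation):
    """Return the uncertain nodes kept in the world defined by a valuation."""
    nodes = set()
    if valuation["e_jane"]:
        # Both cie nodes depend on the same event, so these facts come together.
        nodes.update({"surname", "place of birth"})
    if valuation["ind_occupation"]:
        nodes.add("occupation")
    return nodes


def probability(required):
    """Probability that all nodes in the set `required` are present."""
    names = list(CHOICES)
    total = 0.0
    for values in product([False, True], repeat=len(names)):
        valuation = dict(zip(names, values))
        weight = 1.0
        for name in names:
            weight *= CHOICES[name] if valuation[name] else 1.0 - CHOICES[name]
        if required <= present_nodes(valuation):
            total += weight
    return total
```

With this encoding, the probability that both of Jane's facts are present is 0.9, whereas modelling
them as two independent facts of probability 0.9 each would give 0.81.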

This hardness result is not surprising if events are used indiscriminately, but are there safe ways to 
use them without leading to intractability? In fTj, we have answered this question in the affirmative,
by introducing the notion of event scopes, and stating the first (to our knowledge) non-trivial sufficient
condition that guarantees the tractability of query evaluation on PrXML trees with events. Intuitively,
the scope of an event is the set of nodes where the value of this event must be “remembered” when
trying to evaluate a query on the tree; in Figure 1, the scope of e_Jane consists of the nodes “surname” and
“place of birth” and their descendants. The scope of a node n is the set of events having n in their scope.
We showed that for PrXML documents where the scopes of all nodes have size bounded by a constant, the
evaluation of a fixed MSO query can be performed in PTIME in the input document.

In fact, this claim follows from much more general results about structurally tractable instances. We
now turn to these.

2.2 Tree-Like Data 

When data cannot easily be represented as a tree, a natural way to write it is to use relational databases
(or instances) [2]. We can then represent uncertain data using the formalism of c-instances [32, 29],
which augments relational instances with propositional annotations on facts using Boolean events, each
event valuation defining a possible world obtained by retaining only the facts whose annotation
evaluates to true. An example c-instance is given as Table 1, describing which trips should be booked
depending on the conferences that a researcher wishes to attend: PODS is taking place in Melbourne
and STOC in Portland. We can use pc-instances ll^ [SH to model probabilistic distributions on
instances, simply by giving independent probabilities to the events of the c-instance.

From             To               Annotation
Paris, CDG       Melbourne, MEL   pods
Melbourne, MEL   Paris, CDG       pods ∧ ¬stoc
Melbourne, MEL   Portland, PDX    pods ∧ stoc
Paris, CDG       Portland, PDX    ¬pods ∧ stoc
Portland, PDX    Paris, CDG       stoc

Table 1: Example c-instance
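
The possible-world semantics of Table 1 can be sketched in Python as follows. The encoding of
annotations as functions of the event valuation, and the event probabilities used to obtain a
pc-instance, are assumptions chosen for illustration.

```python
from itertools import product

# Facts of Table 1, each with its annotation as a function of the event valuation.
C_INSTANCE = [
    (("Paris, CDG", "Melbourne, MEL"), lambda v: v["pods"]),
    (("Melbourne, MEL", "Paris, CDG"), lambda v: v["pods"] and not v["stoc"]),
    (("Melbourne, MEL", "Portland, PDX"), lambda v: v["pods"] and v["stoc"]),
    (("Paris, CDG", "Portland, PDX"), lambda v: not v["pods"] and v["stoc"]),
    (("Portland, PDX", "Paris, CDG"), lambda v: v["stoc"]),
]

# Hypothetical independent event probabilities (not from the original text).
EVENT_PROBS = {"pods": 0.5, "stoc": 0.2}


def possible_world(valuation):
    """Keep exactly the facts whose annotation evaluates to true."""
    return [fact for fact, annotation in C_INSTANCE if annotation(valuation)]


def fact_probability(fact):
    """Probability that `fact` is present, summing over all event valuations."""
    events = list(EVENT_PROBS)
    total = 0.0
    for values in product([False, True], repeat=len(events)):
        valuation = dict(zip(events, values))
        weight = 1.0
        for e in events:
            weight *= EVENT_PROBS[e] if valuation[e] else 1.0 - EVENT_PROBS[e]
        if fact in possible_world(valuation):
            total += weight
    return total
```

For instance, when both events are true, the possible world is the round trip from Paris to Melbourne,
then to Portland, and back to Paris.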

There are several query languages for relational instances and (p)c-instances: existentially quantified
conjunctions of atoms (known as conjunctive queries or CQs), MSO queries, Datalog 0, or some of its
variants such as frontier-guarded Datalog ifTTIl. However, we know that evaluating a fixed CQ is already
#P-hard in data complexity, even on TIDs [36], which are much less expressive than pc-instances.

Yet, as we saw in the previous section, hardness does not necessarily hold for tree-shaped data. Could
we then show the tractability of query answering on TIDs which are assumed to be tree-shaped? In fact,
we can show Q that tractability holds for TID instances of bounded treewidth [42], which intuitively
requires that they are close to a tree.

Theorem 1 Defining the treewidth of a TID as that of its underlying relational instance (forgetting
about the probabilities), for input TIDs with treewidth bounded by a constant, the evaluation of a fixed 
MSO query can be performed in PTIME data complexity. The complexity drops to linear time if we 
assume constant-time arithmetic operations. 

This result cannot directly generalize to pc-tables, because they allow arbitrary propositional
annotations on facts, so CQ evaluation is #P-hard in data complexity even on single-fact pc-instances. Hence,
to cover pc-tables as well, we would need to limit the expressiveness of annotations. Our idea is to
write annotations as Boolean circuits rather than formulae, and look at the treewidth of the annotation
circuit. We can show fT] that tractability does not follow from bounded treewidth of the instance and of
the circuit in isolation; rather, we must require the existence of a bounded-width tree decomposition of
the instance and circuit, which respects the link between circuit gates and the facts that they annotate.
We call those bounded-treewidth pcc-instances, and we can show:

Theorem 2 Evaluating a fixed MSO query on bounded-treewidth pcc-instances has PTIME or
linear-time data complexity (depending on the cost of arithmetic operations).

This general result implies Theorem 1 and the scope-based tree tractability results of the previous
section, as these formalisms can be rewritten to bounded-treewidth pcc-instances. It relates to
Courcelle’s theorem fTSl for usual relational instances, which shows that MSO queries (which are generally
NP-hard) can be evaluated in linear-time data complexity if we assume constant treewidth. To show
this, one compiles [45] the MSO query q, in a data-independent fashion, to a tree automaton A which
can read tree encodings of bounded-treewidth instances and determine whether they satisfy q. We
follow the same approach, but we show that A can also be run on an uncertain instance I, producing a
lineage circuit C that describes which possible worlds of I are accepted by A. We then show that C
has bounded treewidth, and so the probability that I satisfies q can be computed from C via standard
message passing techniques lITTl. Thus, bounded-treewidth pcc-instances are structurally tractable.

Our method relates to CQ evaluation methods on probabilistic instances which compute a lineage 
of the query and evaluate the probability of that lineage. This line of related work has proven fruitful, 
e.g., to identify a dichotomy [20] between safe and unsafe queries (depending on the data complexity
of evaluating them on TID instances). Our approach is different: we assume a restriction on the data, 
namely bounded treewidth, and show that the lineages that we obtain are always tractable, for any query 
that can be compiled to an automaton: beyond CQs, this covers MSO, frontier-guarded Datalog, and 
more generally guarded second-order queries. Also, our lineages are circuits rather than formulae, and 
are constructed from an automaton for the query rather than an execution plan. We use this to cast a 
new light on semiring provenance: in the case of monotone queries, our lineage circuits are provenance 
circuits ||2T|| matching standard definitions of semiring provenance [28] for absorptive semirings. We
show this by connecting the automaton to a new intrinsic definition of provenance for the query.

Of course, our assumption of bounded treewidth means that we do not cover many practical use
cases, beyond tree-shaped data. We could address this from a theoretical angle, as we do not know
yet whether Theorem 2 generalizes to weaker assumptions such as bounded clique-width or
hypertree-width [24]. However, in more pragmatic terms, we hope to extend our result to partial tree
decompositions: we would structure uncertain instances as a high-treewidth core and low-treewidth tentacles, and
evaluate queries by combining Theorem 2 on the tentacles and sampling-based approximate methods
on the core. The assumption is that real-world uncertain data, while it may not have bounded treewidth, 
should have large low-treewidth parts that can be dealt with using our exact approach. Approximate 
query evaluation would then be restricted to the core, and could thus be made faster or more accurate. A 
similar idea (in a more restricted context) was recently studied in [38], where it was shown to improve
the performance of source-to-target query evaluation on uncertain graphs. 

Another point that we intend to study, in terms of practical applicability, is the question of combined 
complexity. Indeed, compiling MSO queries to automata is generally non-elementary in the query. One 
possibility around this would be to adapt the construction to monadic Datalog [26]; another one would
be to investigate the performance of practical automata compilation techniques [30].

2.3 Reasoning Under Probabilistic Rules 

We conclude our study of structural tractability for tuple uncertainty, by describing our vision for 
tractable reasoning under probabilistic rules. 

When evaluating queries on incomplete knowledge bases (KBs) such as Wikidata [46], we may miss
some answers because the corresponding facts are absent from the KB. However, if we know some hard
constraints about the KB (e.g., the “located in” relation is transitive), it makes more sense to say that a
query is true if it is certain under the constraints, namely, if it is satisfied by all completions of the KB
that obey the constraints. This is called open world query answering, and it generalizes standard query 
evaluation (which is the case where there are no rules). 

Our claim is that it would be more useful to reason under soft rules, i.e., probabilistic rules. For 
instance, if the birth date of a person is missing from the KB, we can deduce a likely range for the 
date using any other fact about the person. Likewise, a citizen of a country often lives in that country, 
and probably speaks the official language of the country. Such rules could be produced by association 
rule mining |Ul, or using KB-specific methods [23]. Of course, some of the facts that they imply may
be wrong, but on average we expect them to help reduce incompleteness in the KB. Hence, we would
hope to obtain better query answers by asking for the likely answers under many uncertain rules, rather 
than the certain answers under a few hard rules. 

There are already several approaches to reason under uncertain rules, such as probabilistic
programming languages (e.g., ProbLog [40]), or solutions based on Markov Logic Networks flTI (e.g., [33]).
We would need an approach that satisfies some desiderata. First, it should be able to express rules
which assert the (probable) existence of new elements, or nulls, e.g., a PhD student and their
advisor have probably co-authored some paper (which may be unknown to the KB). This is not possible
with approaches that focus on vanilla Datalog rules (e.g., lUl), and requires existential Datalog, or
Datalog+/-, as is done, e.g., in [25].

Second, unlike [25], the approach should be able to express rules that usually apply, not rules which
have a certain probability of always applying. For instance, if we say that citizens of a country are
born there with 80% probability, the semantics of [25] is that the rule is always true with probability
80%, and always false otherwise. Our desired semantics is that the rule applies, on average, in 80% of
cases. Maybe closest to our requirements is IfT^, but the focus of this work is purely declarative, leaving
open the question of the tractability of query answering tasks for such a model.
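
The difference between the two semantics can be made concrete with a small calculation, sketched
below in Python. It assumes, purely for illustration, that under the desired semantics each application
of the rule succeeds independently with probability p; the text only asks that the rule apply in 80% of
cases on average, so independence is an added assumption, as are the function names.

```python
def prob_all_apply_global(p, n):
    """Rule is always true with probability p and always false otherwise.

    All n applications succeed together or fail together, so for n > 0
    the answer does not depend on n.
    """
    return p if n > 0 else 1.0


def prob_all_apply_per_case(p, n):
    """Each of the n applications succeeds independently with probability p (assumed)."""
    return p ** n


def expected_applications(p, n):
    """Expected number of successful applications; the same under both semantics."""
    return p * n
```

For three citizens and p = 0.8, the first semantics gives probability 0.8 that all three are born in
their country, against 0.8^3 = 0.512 under independent applications, although both semantics predict
the same expected number of successful applications.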

Of course, formalizing our desired semantics for probabilistic rules raises many challenging
questions. First, there may be multiple independent ways to deduce the same fact, so determining the
overall probabilities of new facts is tricky, especially as there may be correlations, and cyclic
derivations where facts are deduced via a path that involves themselves. Second, the possible consequences
of the rules may be infinite, so that there may be infinitely many possible worlds to consider (unlike,
e.g., pc-instances). We hope to formalize such a semantics by a variant of the chase [Jl, yielding both
a probabilistic process to generate possible worlds, and a reasoning process to describe the possible
lineages of facts. Alternatively, another possibility would be to eliminate some rules by rewriting them
into the query.

The other challenge posed by probabilistic rules is the question of tractability. For some languages 
(e.g., guarded Datalog ll^ with terminating chase), we hope to preserve treewidth-based tractability 
guarantees from the instance to the rule consequences. If the chase does not terminate, a possibility 
would be to represent it as a recursive Markov chain ifTSll, or to truncate it and control the error.

Beyond guarded rules, it would be practically useful to support equality constraints, number
restrictions (e.g., “people have at most two parents”), or closed-world domains: for instance, when we deduce
that a person has a country of residence, the country probably already exists in the knowledge base, 
rather than being a fresh null. However, we do not know which distribution to assume on such reuses, 
and we fear that our criteria for tractability would no longer apply if such reuses are possible. 


3 Order Uncertainty 

We now leave the standard setting of fact uncertainty and move to order uncertainty: we want to model 
data where we are unsure about the order between facts or data items. In this setting, to justify the 
tractability of uncertain data, we need to invent the right representation systems to model the uncertain 
data and the query output. Of course, uncertain order relations between elements and tuples could in 
principle be modeled as fact uncertainty, but this would ignore the structure of the uncertainty: it would 
create many facts and correlations, leaving little hope for tractability. 

Yet, there are many scenarios where order uncertainty is specifically needed. For instance, consider
the problem of integrating lists of items that are ordered by an unknown criterion, e.g., a proprietary
relevance function, or the preferences of various users [43]. If we wish to take the union of these lists,
or to look at pairs (e.g., choices of a hotel and restaurant in the same neighborhood), there are multiple
reasonable choices to order the result of such operations while respecting the order constraints
imposed by the original lists. The same problem can arise when integrating logged events from different
machines or files, where the log entries are sequentially ordered but do not mention a global
timestamp (e.g., logs of the fetchmail program, or /var/log/dmesg on Unix systems); or when integrating
concurrent edits to a document in a version control system ifTOll. The same problem can occur when
searching for the top-k most frequent itemsets in data mining: if we only have an incomplete view of
the data to mine, as in our study of data mining on the crowd |4h, we need to reason under incomplete
information about the order relation on the support value of itemsets.

We have studied this problem and proposed a bag semantics for the positive relational algebra that 
applies to relations with uncertain order @, which relies on labeled partial orders as its representation 
system. Again, we observe that many tasks on the resulting representations are intractable to solve: for 
instance, given a labeled partial order, we cannot tractably determine whether an input total order is 
one of its possible worlds. Yet, for this problem, some specific structures of partial orders are tractable,
such as the ones that were constructed on unordered relations, or totally ordered relations (depending
on the semantics of operators).
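
As an illustration of this membership problem, the following Python sketch checks by backtracking
whether a sequence of labels is a possible world of a labeled partial order, that is, whether it is the
label sequence of some linear extension. The representation (a list of labels and a set of precedence
pairs) and the function name are assumptions made for this sketch. The search may take exponential
time, which is consistent with the intractability mentioned above.

```python
def is_possible_world(labels, order, target):
    """Check whether `target` is the label sequence of some linear extension.

    labels: list giving the label of each element (elements are 0..n-1);
            several elements may carry the same label (bag semantics).
    order:  set of pairs (a, b) meaning that element a must come before b.
    target: candidate sequence of labels.
    """
    n = len(labels)
    if len(target) != n:
        return False
    predecessors = {b: set() for b in range(n)}
    for a, b in order:
        predecessors[b].add(a)

    def extend(position, placed):
        # Try every unplaced element with the right label whose
        # predecessors have all been placed already.
        if position == n:
            return True
        for element in range(n):
            if (element not in placed
                    and labels[element] == target[position]
                    and predecessors[element] <= placed):
                if extend(position + 1, placed | {element}):
                    return True
        return False

    return extend(0, frozenset())
```

The difficulty comes from duplicate labels: when several elements carry the label read at a given
position, the procedure has to guess which one to place, and a wrong guess is only detected later.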

Many questions on order uncertainty are still open. For instance, it would be nice to specify a
compositional semantics for the order manipulation operators of SQL, to formalize all possible reasonable
behaviors of SQL implementations. However, we would need to extend our representation system to
more operators, and to set semantics as well as bag semantics. It would also be interesting to extend
our approach to allow both fact and order uncertainty, for instance by extending our constructions to
support provenance.

Another challenge is to extend our uncertain model to a probabilistic model, but doing so for order
uncertainty is harder than going, e.g., from c-tables to pc-tables. How can we define a probability
distribution on the possible ways to order the data? One possibility is to study order that arises from
numerical values (e.g., support, in our data mining scenario). We have initial ideas Q, but there are a
lot of open questions left. Some are definitional: What are the possible worlds? What is our best guess
on how to interpolate missing numerical values on partially ordered data? Others are operational: even
counting the possible worlds of partially ordered data may be intractable |[T4l.


4 Conditioning 

Last, we turn to data that has been conditioned [44]: starting with an original uncertain data instance, we
have revised it to force the outcome of certain probabilistic events, given new observations or additional
information.
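
A brute-force view of conditioning can be sketched in Python as follows: we enumerate the valuations
of the events, discard those that contradict the observation, and renormalize the remaining weights.
The event names, their probabilities, and the helpers `condition` and `marginal` are hypothetical, and
the enumeration is exponential in the number of events.

```python
from itertools import product


def condition(event_probs, observation):
    """Return the distribution over valuations conditioned on `observation`.

    event_probs: dict mapping each event to its independent prior probability.
    observation: function from a valuation (dict event -> bool) to bool.
    The result maps each consistent valuation, encoded as a tuple of
    (event, value) pairs, to its posterior probability.
    """
    events = sorted(event_probs)
    posterior = {}
    for values in product([False, True], repeat=len(events)):
        valuation = dict(zip(events, values))
        weight = 1.0
        for e in events:
            weight *= event_probs[e] if valuation[e] else 1.0 - event_probs[e]
        if observation(valuation):
            posterior[tuple(zip(events, values))] = weight
    mass = sum(posterior.values())
    if mass == 0:
        raise ValueError("the observation has probability zero")
    return {valuation: w / mass for valuation, w in posterior.items()}


def marginal(posterior, event):
    """Posterior probability that `event` is true."""
    return sum(w for valuation, w in posterior.items() if dict(valuation)[event])
```

Observing that a single event is true simply fixes that event and leaves the marginals of the other
events unchanged, whereas observing an arbitrary formula, such as a disjunction of two events, makes
these events correlated in the conditioned distribution.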

The motivation for this kind of uncertain data is very general, because uncertain data can often be
made more certain if we are ready to pay the price. For instance, we can often ask a human expert to
verify whether a fact is really true, or whether an event occurred or not. If we do so, we must figure out
two things: which question to ask, and how to incorporate the answer into our uncertain model.

The answer integration step already poses a problem of tractability: for instance, we can easily
condition a c-instance to indicate that an event is true, but it is much harder to force a fact annotation to
be true, as it can be an arbitrary formula. Further, we do not know at all whether structural tractability
guarantees on the original instance can be preserved by conditioning. We have good hopes for this to
be possible, as existing work in the probabilistic XML context has shown that it is tractable to query a
document that has been conditioned using a specific language of constraints ifT^; note, however, that
this work does not attempt to construct an actual probabilistic XML document that would represent the
distribution obtained by conditioning.

An entirely different issue is to deal with the first step of choosing which query to ask. It is tricky 
to even define what the best question is, and even harder to find a sensible definition that is tractable 
to evaluate. The most relevant study of this issue may come from crowd data sourcing: when we try 
to extract knowledge from a crowd of human users, we are never sure about what we know, because 
we can never fully trust the answers that have been produced by the crowd workers. Yet, from our 
current knowledge and our current estimation of the likely answers, we must decide what is the next 
question that we should ask to the crowd O, to reduce our uncertainty on the final answer. To our
knowledge, however, existing crowd data sourcing techniques [39] use very ad-hoc representations
which are specific to some simple query types.

Hence, it is an important challenge to design a generic uncertainty representation framework suitable 
for such iterative scenarios: at each step, the data is conditioned based on our observations, and we need 
to choose the queries that we intend to make, relative to their cost. Beyond crowdsourcing, we believe 
that our vision of such a system [ J] applies to many situations that involve a tradeoff between spending 
more resources and acquiring more knowledge. 


5 Conclusion 

We have presented our results about how to deal with order uncertainty, and fact uncertainty on tree and 
tree-like instances. We have presented many perspectives to extend them: for instance, representing the 
consequences of uncertain deduction rules, or the result of conditioning the existing data with additional 
information. 

There are interesting directions left to explore. An important one would be to evaluate the practical 
applicability of what we propose, on datasets or for concrete tasks involving uncertain ordered data or 
low-treewidth data. The design of a practical implementation would also raise theoretical questions: 
How to combine our methods with approximate methods such as sampling? Which optimizations 
would help us deal with the high combined complexity? 

In terms of representations, we hope to understand how order and fact uncertainty can be combined, 
and whether the result could be extended to cover more uncertainty types, such as the result of
conditioning. Indeed, we believe that a fundamental challenge for uncertain data representation is to support
dynamic situations, where the data can evolve: new facts are extracted, deduction rules are fired, and
existing information is disambiguated and clarified through human queries or complex processing.
Designing such a framework would be both a theoretical and a practical challenge.